[57] ABSTRACT

The system and method of the present invention uses a 
zero-crossing rate measurement in order to determine the 
initiation and/or termination of speech in an audio signal 
input. It is especially well suited for detecting the termina- 
tion of a telephone message in a telephone answering device. 
Specifically, a sample of the zero-crossing rate signal is 
determined by counting the number of consecutive speech 
samples required for the occurrence of a pre-defined number 
of consecutive zero-crossings. The resultant zero-crossing 
rate signal is smoothed and applied to a differentiator. A 
short-time magnitude integration is performed to measure 
the energy in the differentiated signal. The output of the 
magnitude integration is provided to a threshold detector 
which produces a sequence of decision values indicating the 
presence or absence of speech. Finally, the decision values 
are filtered to produce a more definitive sequence of final 
decision values. 

22 Claims, 5 Drawing Sheets 
DETECTION OF TONAL SIGNALS
FIELD OF THE INVENTION 

The present invention relates generally to the field of
speech detection, and more specifically to an improved
system and method for detecting initiation and/or termination
of a speech message in a voice storage device or
telephone answering device.

DESCRIPTION OF THE RELATED ART 

Telephone answering machines are a fundamental artifact
of the modern life-style. A fundamental problem connected
with answering machine performance is that of detecting the
end of a message. Since the answering machine employs a
finite storage medium (tape or RAM) to record incoming
speech messages, it is essential that the answering machine
be able to accurately detect the end of these messages. The
end of a message can occur in many ways, but the result is
nearly always some form of tonal sequence (i.e. sequence of
tones) or background noise (silence). For the sake of
discussion, this end of message signal, which ensues upon
the conclusion of the speech signal, will be called the
termination signal. It is simple to distinguish silence from
speech by the use of a simple energy measure. Background
noise usually has much smaller power, and thus energy, than
a speech signal. However, tonal signals, which represent the
most typical termination signal, contain high energy. Thus
the energy measure fails as a general technique for
distinguishing speech from termination signals.

The problem of detecting the end of a message is compounded
by the fact that the nature of the tones is best
assumed to be unknown. Dial tone is the most common
result, but this varies from country to country, and may even
vary across private branch exchanges (PBX's). Other signals
may also occur which may have an on-off cadence, and
which may contain a variety of frequencies.

It should be noted that the problem of detecting the
termination of speech in an answering machine message is
part of the more general problem of detecting the initiation
and termination (i.e. the endpoints) of speech in a noise
environment. One prior art endpoint detection system
employs zero-crossing rate (ZCR) and short-time energy
measurements with statistically determined detection
thresholds [Rabiner and Schafer, Digital Processing of Speech
Signals, pages 130-133, published by Prentice-Hall, ISBN
0-13-213603-1, TK7882.S65R3]. In particular, Rabiner &
Schafer disclose an algorithm for detecting the endpoints of
an isolated speech utterance which involves computing a
zero-crossing rate signal and an average magnitude signal
based on the signal of interest. The zero-crossing rate signal
is calculated using a moving window with a 10 millisecond
time-width: the number of zero-crossings in a 10 millisecond
window is reported as a measure of the local
zero-crossing rate. Similarly, the average magnitude signal is
calculated using a moving window with a 10 millisecond
time-width: a weighted sum of the magnitudes (absolute
values) of samples in a window is reported as a measure of
local energy.
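
As an illustration only, the two short-time measures attributed above to Rabiner & Schafer might be computed as in the following Python sketch. The function names, the default non-overlapping hop, the uniform weights, and the 80-sample window (10 milliseconds at an assumed 8 kHz sampling rate) are assumptions of this sketch and are not taken from the reference.

```python
def count_zero_crossings(frame):
    """Count sign changes between consecutive samples of one window."""
    crossings = 0
    for prev, cur in zip(frame, frame[1:]):
        # Zero is treated as non-negative (an assumption of this sketch).
        if (prev >= 0) != (cur >= 0):
            crossings += 1
    return crossings


def short_time_measures(signal, window_len=80, hop=None):
    """Return (zcr, avg_magnitude), one value per window position.

    window_len=80 corresponds to 10 ms only at an assumed 8 kHz rate.
    hop defaults to window_len, i.e. non-overlapping windows.
    """
    hop = window_len if hop is None else hop
    zcr, avg_mag = [], []
    for start in range(0, len(signal) - window_len + 1, hop):
        frame = signal[start:start + window_len]
        zcr.append(count_zero_crossings(frame))
        # Uniform weights; the text above only says "weighted sum".
        avg_mag.append(sum(abs(s) for s in frame) / window_len)
    return zcr, avg_mag
```

For a rapidly alternating window the crossing count is high, while a constant window gives a count of zero regardless of its magnitude.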

The zero-crossing rate and average magnitude signals are
assumed to contain no speech content during an initial
training period. The zero-crossing rate signal and average
magnitude signal samples during this training period are
subjected to a statistical analysis to determine two different
average magnitude thresholds and one zero-crossing rate
threshold. The algorithm uses the two average magnitude
thresholds and the zero-crossing rate threshold to determine
the endpoints of a speech utterance in the signal of interest.

The algorithm operates as follows. First, the average
magnitude signal is searched to determine a maximal
interval [A,B] with the property that the average magnitude
signal exceeds the larger magnitude threshold everywhere
on the interval. Second, the endpoints of the maximal
interval are extended outward to points where the average
magnitude signal falls below the smaller magnitude
threshold, defining interval [C,D]. Third, the zero-crossing
rate signal is consulted to possibly extend the endpoints even
further. Namely, in the zero-crossing rate signal, the 25
samples immediately to the left of (preceding) C are
searched. If the zero-crossing rate signal exceeds the
zero-crossing rate threshold three or more times in the 25
samples, the start point C is moved to the location of the first
such exceeding. Similarly, the finish point D is
conditionally moved to the right.
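
The three steps just described might be sketched in Python as follows. This is a reading of the prose above and not the code of the reference: the function name, the choice of the longest qualifying run as the "maximal interval", the handling of ties at the lower threshold, and the interpretation of "first such exceeding" as the earliest frame (mirrored as the latest frame for the finish point) are all assumptions.

```python
def find_endpoints(avg_mag, zcr, upper, lower, zcr_thresh,
                   lookback=25, min_hits=3):
    """Return (start, finish) frame indices of one utterance, or None."""
    # Step 1: longest run of frames whose magnitude exceeds `upper`.
    best = None
    run_start = None
    # A sentinel value closes a run that reaches the end of the signal.
    for i, m in enumerate(list(avg_mag) + [float("-inf")]):
        if m > upper:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if best is None or (i - run_start) > (best[1] - best[0] + 1):
                best = (run_start, i - 1)
            run_start = None
    if best is None:
        return None
    start, finish = best

    # Step 2: extend outward while the magnitude stays at or above `lower`.
    while start > 0 and avg_mag[start - 1] >= lower:
        start -= 1
    while finish < len(avg_mag) - 1 and avg_mag[finish + 1] >= lower:
        finish += 1

    # Step 3: inspect up to `lookback` frames outside each endpoint.
    before = [i for i in range(max(0, start - lookback), start)
              if zcr[i] > zcr_thresh]
    if len(before) >= min_hits:
        start = before[0]    # earliest exceeding (assumed reading)
    after = [i for i in range(finish + 1,
                              min(len(zcr), finish + 1 + lookback))
             if zcr[i] > zcr_thresh]
    if len(after) >= min_hits:
        finish = after[-1]   # mirror image for the finish point
    return start, finish
```

In this sketch a high-energy core is found first, and the zero-crossing rate is consulted only to pull the endpoints further out, which is why the method depends on speech having higher energy and zero-crossing rate than the background.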

Thus, the algorithm disclosed by Rabiner & Schafer
apparently uses the observation that speech is associated
with higher zero-crossing rate and higher average magnitude
(or energy) than background noise. Consequently, the algorithm of
Rabiner & Schafer is unlikely to perform adequately in
situations where the background noise has power and
zero-crossing rate comparable to that of the speech signal. Thus
a system and method are needed whereby the initiation
and/or termination of a speech signal may be detected in a
noise environment where the noise is not necessarily of low
zero-crossing rate or low energy. In particular, a system and
method are needed whereby the termination of speech may
be detected in a telephone message.

SUMMARY OF THE INVENTION 

The system and method of the present invention uses a
zero-crossing rate measurement in order to determine the
initiation and/or termination of speech in an audio signal
input. The present invention is especially well suited for
detecting the termination of a telephone message in a
telephone answering device. Specifically, a sample of the
zero-crossing rate signal is determined (a) by counting the
number of consecutive speech samples required for the
occurrence of a pre-defined number of consecutive
zero-crossings, or (b) by counting the number of zero-crossings
occurring in a pre-defined number of consecutive speech
samples. The former calculation gives a zero-crossing
period and the latter gives a zero-crossing rate. However the
distinction is not significant to the present invention. The
resultant zero-crossing rate signal is smoothed and applied
to a differentiator. An energy signal is then produced from
the differentiated signal, by measuring the energy in the
differentiated signal over a moving window in time. This
energy measurement captures the amount of variation of the
zero-crossing rate signal. A short-time magnitude integration
is performed to measure the energy in the differentiated
signal.
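
The two ways of forming a zero-crossing rate sample, (a) and (b) above, might be sketched in Python as follows. The function names, the treatment of a zero-valued sample as non-negative, and the use of non-overlapping blocks in variant (b) are assumptions of this sketch.

```python
def is_crossing(prev, cur):
    """True when consecutive samples lie on opposite sides of zero."""
    return (prev >= 0) != (cur >= 0)


def zero_crossing_periods(samples, num_crossings):
    """Variant (a): samples consumed per `num_crossings` zero-crossings."""
    periods = []
    crossings = 0
    count = 0
    for prev, cur in zip(samples, samples[1:]):
        count += 1
        if is_crossing(prev, cur):
            crossings += 1
            if crossings == num_crossings:
                periods.append(count)  # one ZCR sample is emitted
                crossings = 0
                count = 0
    return periods


def zero_crossing_counts(samples, window_len):
    """Variant (b): zero-crossings in each block of `window_len` samples."""
    counts = []
    for start in range(0, len(samples) - window_len + 1, window_len):
        block = samples[start:start + window_len]
        counts.append(sum(is_crossing(p, c)
                          for p, c in zip(block, block[1:])))
    return counts
```

Variant (a) yields small values where the signal oscillates quickly and large values where it oscillates slowly, while variant (b) does the opposite. Either sequence varies when the spectrum varies, which is the property used in what follows.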

Speech has a time-varying spectrum and hence also a
time-varying zero-crossing rate. Hence, while speech energy
is present in the audio input, the energy measurements
should report large values. In contrast, the non-speech signal
which ensues at the end of a telephone call after speech has
terminated is a mixture of tones, multi-tones, and Gaussian
noise, having a locally constant spectrum and thereby a
locally constant zero-crossing rate. Thus, when the speech
signal is absent, the energy measurements should report
small values. By applying the energy measurements to a
threshold detection device, the present invention produces a
sequence of decision values indicating the presence or
absence of speech.
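
A minimal Python sketch of the chain from zero-crossing rate signal to decision values is given below. It assumes a 3-tap median as the smoother and a first difference as the differentiator (both named in the detailed description), and a plain sum of absolute values over non-overlapping blocks as the short-time magnitude integration. The function names, the end-sample handling of the median, and the numeric values in the test are assumptions of this sketch.

```python
from statistics import median


def median3(values):
    """3-tap median smoother; the two end samples pass through unchanged."""
    if len(values) < 3:
        return list(values)
    middle = [median(values[i - 1:i + 2])
              for i in range(1, len(values) - 1)]
    return [values[0]] + middle + [values[-1]]


def first_difference(values):
    """Discrete differentiator: d[n] = v[n] - v[n-1]."""
    return [cur - prev for prev, cur in zip(values, values[1:])]


def variation_energy(diffs, window_len):
    """Sum of |d[n]| over consecutive blocks of `window_len` samples."""
    return [sum(abs(d) for d in diffs[s:s + window_len])
            for s in range(0, len(diffs) - window_len + 1, window_len)]


def speech_decisions(zcr_signal, window_len, threshold):
    """True where the ZCR variation exceeds `threshold`."""
    diffs = first_difference(median3(zcr_signal))
    return [e > threshold for e in variation_energy(diffs, window_len)]
```

A zero-crossing rate signal that fluctuates (as during speech) produces a large block sum and a True decision, while a constant zero-crossing rate (as for a steady tone) produces a sum of zero and a False decision.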

Furthermore, the present invention preferably includes 
filtering the sequence of decision values. By examining a 
moving-window of K consecutive decision values, a
sequence of "final" decision values may be asserted.
Namely, in each window the decision values which indicate
the presence of speech are counted. When the count exceeds
a first threshold J, then a final decision is asserted indicating
the presence of speech. Conversely, when the count is
smaller than a second threshold I, a final decision is asserted
indicating the absence of speech.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the present invention can be
obtained when the following detailed description of the
preferred embodiment is considered in conjunction with the
following drawings, in which:

FIG. 1A is a block diagram of a speech signal detector 100
according to the present invention;

FIG. 1B provides a motivation of the present invention by
means of a zero-crossing rate signal depicted during a
transition from speech to non-speech;

FIG. 2 is a block diagram of the zero-crossing rate
calculator 110 according to the present invention;

FIG. 3 is a block diagram of the differentiation unit 120
according to the present invention;

FIG. 4 is a block diagram of the discriminator 130
according to the present invention;

FIG. 5 is a speech storage device 500 according to the
present invention;

FIG. 6 is a block diagram of a telephone answering device
600 according to the present invention; and

FIG. 7 is a block diagram of a preferred embodiment of
the speech signal detector 100 according to the present
invention.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT

Referring now to FIG. 1A, a block diagram of a speech
signal detector 100 according to the preferred embodiment
of the present invention is shown. The speech signal detector
100 comprises an input 105, a zero-crossing rate calculator
110, a differentiation unit 120, a discriminator 130, and an
output 140. The zero-crossing rate calculator 110 is coupled
to input 105. The zero-crossing rate calculator 110 is also
coupled to the differentiation unit 120. The differentiation
unit 120 is coupled to the discriminator 130. And the
discriminator 130 is coupled to the output 140.

An input signal is supplied to the speech signal detector
100 through input 105. In the preferred embodiment of the
invention, the input signal is a digitized telephone signal.
The zero-crossing rate calculator operates on the input signal
to produce a zero-crossing rate signal. A sample of the
zero-crossing rate signal provides a measure of local
zero-crossing rate in the input signal. The zero-crossing rate
signal is provided to differentiation unit 120. The
differentiation unit 120 uses the zero-crossing rate signal to calculate
a differentiated zero-crossing rate signal. The differentiated
zero-crossing rate signal measures the variation (or rate of
change) of the zero-crossing rate signal. The differentiated
zero-crossing rate signal is supplied to the discriminator
130. The discriminator 130 uses the differentiated
zero-crossing rate signal to determine the instantaneous presence
or absence of speech in the input signal. An output signal,
reflecting the instantaneous presence or absence of speech in
the input signal, is provided by discriminator 130 via output
140.

Referring now to FIG. 2, a block diagram of the
zero-crossing rate calculator 110 according to the present
invention is shown. The zero-crossing rate calculator 110 operates
on the input signal to produce a zero-crossing rate signal.
The zero-crossing rate calculator 110 comprises a
false-crossing pre-filter 210 and a zero-crossing rate measurement
unit 220. The false-crossing pre-filter 210 is coupled to the
input 105. Also the false-crossing pre-filter 210 is coupled to
the zero-crossing rate measurement unit 220. The
zero-crossing rate measurement unit 220 has an output which is
coupled to differentiation unit 120.

The false-crossing pre-filter 210 receives the input signal
from the input 105, and serves to map low amplitude input
samples to zero. This pre-filtering eliminates spurious
zero-crossings due to noise, especially during the low level part
of a dual tone beat. The false-crossing pre-filter 210 operates
on each input sample to produce an output sample according
to the following rule: if the absolute value of an input sample
is smaller than a fixed threshold, the output sample is set to
zero, else the output sample is equal to the input sample. The
output signal thereby produced is referred to as the modified
input signal.

The zero-crossing rate measurement unit 220 receives the
modified input signal from the false-crossing pre-filter 210
and produces a zero-crossing rate signal. The zero-crossing
rate signal comprises a sequence of ZCR samples. A ZCR
sample is calculated by counting the number of samples
required for the occurrence of L successive zero-crossings in
the modified input signal, where L is a system defined
constant. Thus a ZCR sample actually measures zero-crossing
period. However, the distinction between zero-crossing rate
and period is not significant for the present invention. In an
equivalent embodiment of the invention, a ZCR
sample is calculated by counting the number of
zero-crossings which occur in a window of M successive samples
of the input signal, where M is a system defined constant.

Referring now to FIG. 1B, a motivation of the present
invention is provided by means of a zero-crossing rate signal
depicted during a transition from speech to non-speech.
Notice that speech is associated with a time-varying
zero-crossing rate (ZCR), while the tonal signals and/or noise,
which occur after the speech message, have relatively
constant zero-crossing rate. By performing a differentiation
operation, the intrinsic variation (rate of change) of the
zero-crossing rate signal is exposed. Furthermore, by
performing a moving-window integration of the absolute value
(magnitude) of the differentiated signal, the variation in the
zero-crossing rate is monitored on a continuous basis. A
large value of the magnitude integration indicates the
presence of speech, and a small value indicates the absence
of speech.

Referring now to FIG. 3, a block diagram of the
differentiation unit 120 according to the present invention is
presented. The differentiation unit 120 uses the
zero-crossing rate signal received from the zero-crossing rate
calculator 110 to calculate a differentiated zero-crossing rate
signal. The differentiation unit 120 comprises a smoothing
filter 310 and a differentiator 320. The smoothing filter 310
is coupled to receive the zero-crossing rate signal from the
zero-crossing rate calculator 110. Also the smoothing filter
310 is coupled to the differentiator 320. The differentiator
320 has an output which is coupled to the discriminator 130.

The smoothing filter 310 operates on the zero-crossing
rate signal and produces a filtered zero-crossing rate signal.
In the preferred embodiment of the invention, the smoothing
filter is an N-tap median filter (N=3). The purpose of the
median filter is to remove outlying values from the
zero-crossing rate signal. This type of filtering (a) increases the
smoothness of the zero-crossing rate signal when the input
signal has a constant spectrum (as occurs for tonal
sequences), and (b) leaves the zero-crossing rate signal
relatively unchanged when the input signal is speech, since
speech has a dynamic spectrum.

The filtered zero-crossing rate signal is provided to the
differentiator 320. The differentiator 320 performs a
differentiation operation on the filtered zero-crossing rate signal,
producing a differentiated zero-crossing rate signal. In the
preferred embodiment of the invention, the differentiator
performs a first difference for the sake of computational
efficiency. However in alternate embodiments, any
numerical differentiation algorithm may be employed, subject to
fundamental design constraints for computational efficiency
and accuracy.

Referring now to FIG. 4, a block diagram of the
discriminator 130 according to the present invention is shown. The
discriminator 130 uses the differentiated zero-crossing rate
signal to determine the instantaneous presence or absence of
speech in the input signal. An output signal, reflecting the
instantaneous presence or absence of speech in the input
signal, is provided by discriminator 130 via output 140. The
discriminator 130 includes a magnitude integration unit 410,
a threshold detector 420, and final decision unit 430. The
magnitude integration unit 410 is coupled to receive the
differentiated zero-crossing rate signal from the
differentiation unit 120. Also the magnitude integration unit 410 is
coupled to the threshold detector 420. The threshold detector
420 is coupled to the final decision unit 430, and the final
decision unit 430 is coupled to output 140.

The magnitude integration unit 410 performs a short-time
magnitude integration on the differentiated zero-crossing
rate signal. Thus, each output value from the magnitude
integration unit 410 is computed by integrating the absolute
value of the differentiated zero-crossing rate signal over a
corresponding window (of length P samples). In the
preferred embodiment of the invention, the integral is
performed using the "leaky integrator" given by the transfer
function

H(z) = (1 - a) / (1 - a z^-1).

In other words, if y(n) represents the value of an integral as
it accumulates through the sample window, and x(n)
represents the differentiated zero-crossing rate signal, the leaky
integration is governed by the recurrence relation

y(n+1) = a y(n) + (1 - a) |x(n)|.

At the beginning of the sample window, the cumulative
integral y(n) is initialized to zero. Then the recursive
expression above is applied for every sample x(n) in the P-sample
window. At the end of the sample window, the resultant
value of the accumulated integral is reported as the output
value. The cumulative integral y(n) is then re-initialized to
zero for the next sample window integration. The output of
the magnitude integration unit 410, referred to as the
detection signal, is fed to the threshold detector 420.

In an alternate embodiment of the invention, the
integration over a sample window referred to above is performed by
an FIR filter. In this case, the output value is a weighted
average of the absolute values of the samples in the sample
window.

In yet another embodiment of the invention, the absolute
value mentioned above is replaced by a square. In this case
the output values comprise energy measurements.

The threshold detector 420 compares the resultant
(integration) values comprising the detection signal to a
fixed detection threshold R, and generates a sequence of
decision values. If a resultant value exceeds the threshold R,
the corresponding decision value is assigned a symbol which
indicates the presence of speech. If the resultant value does
not exceed the threshold R, the corresponding decision value
is assigned a symbol which indicates the absence of speech.
In the preferred embodiment, the detection threshold R takes
[...]. The sequence of decision values is referred to
as a decision signal. The decision signal is supplied to the
final decision unit 430.

The final decision unit 430 uses the decision signal to
produce a sequence of final decision values. To calculate the
final decision values, the final decision unit 430 employs a
moving window of K successive decision values from the
decision signal. Namely, a final decision value is calculated
by counting a number of the K successive decision values
which indicate the absence of speech. If the resultant number
is larger than a first threshold J, then the final decision value
is assigned a symbol indicating the absence of speech. If the
resultant number is less than a second threshold I, then the
final decision value is assigned a symbol indicating the
presence of speech. The integers I and J are system defined
constants with I less than or equal to J. The use of two
distinct thresholds adds some hysteresis to the final decision
process and aids in the prevention of spurious changes. The
sequence of final decision values is referred to as a final
decision signal. The final decision signal is asserted as the
output of the final decision unit 430 via output 140.

In the preferred embodiment of the invention, the speech
signal detector 100 operates in a telephone answering
device, where it is important to detect the termination
of a speech message so as to conserve storage space in the
memory media which stores the speech message. However
it is also important that the answering machine capture the whole
speech message. Thus the speech signal detector 100 must
guard against premature/false detection of the end of the
speech message. Decreasing the value of the first threshold
J increases the probability of detecting the absence of
speech. However increasing the value of threshold J
decreases the probability of false detection of the absence of
speech. The value of J must be chosen to balance these
competing requirements. In the preferred embodiment, K is
chosen to equal 20, J is chosen to equal 16, and I is chosen to
equal 14.

Referring now to FIG. 5, a speech storage device 500
according to the present invention is shown. The speech
storage device 500 comprises an input 105, speech signal
detector 100 (of FIG. 1), memory media 510, and control
line 520. The input 105 is coupled to the speech signal
detector 100 and to memory media 510. The speech signal
detector 100 is coupled to the memory media 510 via control
line 520. An input signal is supplied to the speech storage
device via input 105. It is assumed that at least a portion of
the input signal contains a speech signal. The memory media
510 is operable to store the input signal. The speech signal
detector 100 is operable to detect the initiation/termination
of the speech signal within the input signal as described
above. The control line 520 is identical to the output 140 (of
FIG. 1) of the speech signal detector 100. The speech signal
detector 100 provides an output signal via control line 520
indicating initiation/termination of the speech signal, and the
output signal is used to control the storage of the input signal
into the memory media 510. In particular, storage is enabled
when the output signal indicates initiation of the speech
signal, and storage is disabled when the output signal indicates
termination of the speech signal.
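
The leaky integration over a P-sample window and the final decision logic described above might be sketched in Python as follows. The sketch assumes the recurrence has the form y(n+1) = a*y(n) + (1 - a)*|x(n)|. The function and parameter names, the use of non-overlapping windows, the initial state, and the rule of holding the previous final decision when the count lies between I and J are assumptions of this sketch; no value of the coefficient a is given in the description.

```python
def leaky_integrate(diffs, a, window_len):
    """Apply y <- a*y + (1-a)*|x| over blocks of `window_len` samples."""
    outputs = []
    for start in range(0, len(diffs) - window_len + 1, window_len):
        y = 0.0                    # re-initialized for each window
        for x in diffs[start:start + window_len]:
            y = a * y + (1.0 - a) * abs(x)
        outputs.append(y)          # value at the end of the window
    return outputs


def final_decisions(absent_flags, k=20, upper_j=16, lower_i=14,
                    initial_absent=False):
    """Return final decisions (True = speech absent) from raw decisions.

    absent_flags[n] is True when raw decision n indicates no speech.
    Defaults follow the stated preferred values K=20, J=16, I=14.
    """
    finals = []
    state = initial_absent         # assumed starting state
    for n in range(k - 1, len(absent_flags)):
        absent_count = sum(absent_flags[n - k + 1:n + 1])
        if absent_count > upper_j:
            state = True           # absence of speech
        elif absent_count < lower_i:
            state = False          # presence of speech
        # Between lower_i and upper_j the previous decision is held
        # (assumed interpretation of the hysteresis).
        finals.append(state)
    return finals
```

With the hold rule assumed here, a brief run of "absent" raw decisions inside a speech message does not flip the final decision until more than J of the last K raw decisions agree.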

Referring now to FIG. 6, a block diagram of a telephone
answering device 600 according to the present invention is
shown. The telephone answering device 600 comprises an
interface unit 610, a control unit 620, a speaker 630, a
microphone 635, a control panel 640, speech signal detector
100, and memory media 650. The interface unit 610 is
coupled to a central office of an external telephone system
via a telephone line 602. Interface unit 610 is coupled to
control unit 620, speech signal detector 100 (as illustrated in
FIG. 1, and described in detail above), speaker 630,
microphone 635, and memory media 650. Control unit 620 is
coupled to control panel 640. It is noted that control panel
640 may comprise a graphical user interface (GUI) of a
computer system (not shown). Control unit 620 is also
coupled to speech signal detector 100 and memory media
650.

If a user of telephone answering device 600 does not
answer an incoming telephone call within a predetermined
number of ring signals, telephone answering device 600
answers the incoming telephone call. Answering the
telephone call includes the telephone answering device 600
simulating an "off-hook" condition. Telephone answering
device 600 then transmits a pre-recorded outgoing voice
message over telephone line 602. Telephone answering
device 600 then stores a calling party's audible response
(i.e., an incoming voice message) into memory media 650.

Speech signal detector 100 receives a digitized telephone
signal from interface unit 610, and provides to control unit
620 a control signal which indicates the termination of the
speech message (in the telephone signal input). The
telephone answering device 600 disables storage when the
control signal indicates termination of the speech message.

Referring now to FIG. 7, a block diagram of a preferred
embodiment of the speech signal detector 100 according to
the present invention is presented. In this embodiment, the
speech signal detector 100 comprises: a threshold input unit
710; a functional block 720 which counts the number of
samples for achieving a specified number of zero-crossings;
a 3-tap median filter 730; a first difference operation 740; an
absolute value calculation 750; a leaky integrator 760; and
a block 770 which tests the detection signal and makes the
vox (voice activity) decision.

Threshold input unit 710 is identical to false-crossing
pre-filter 210 of FIG. 2. The function block 720, which
counts the number of samples for achieving a specified
number of zero-crossings, is identical to zero-crossing rate
measurement unit 220 of FIG. 2. The 3-tap median filter 730
is a realization of the smoothing filter 310 of FIG. 3. The first
difference operation 740 is a realization of differentiator 320
of FIG. 3. The absolute value calculation 750 and the leaky
integrator 760 are together equivalent to the magnitude
integration unit 410 of FIG. 4. The block 770, which tests the
detection signal and makes the vox (voice activity) decision,
is equivalent to a combination of the threshold detector 420
and the final decision unit 430 of FIG. 4.

Although the system and method of the present invention
has been described in connection with the preferred
embodiment, it is not intended to be limited to the specific
forms set forth herein, but on the contrary, it is intended to
cover such alternatives, modifications, and equivalents, as
can be reasonably included within the spirit and scope of the
invention as defined by the appended claims.

I claim:

1. A system for detecting initiation/termination of a
speech signal for a speech storage device, the system
comprising:

an input for receiving an input signal, wherein at least a
portion of said input signal includes a speech signal;

a zero-crossing rate calculator coupled to said input for
computing a zero-crossing rate signal based upon said
input signal;

a differentiation unit coupled to said zero-crossing rate
calculator which receives said zero-crossing rate signal
from said zero-crossing rate calculator, wherein the
differentiation unit is configured to perform a
differentiation operation with respect to time to produce a
differentiated zero-crossing rate signal;

a discriminator coupled to said differentiation unit which
receives said differentiated zero-crossing rate signal,
wherein said discriminator comprises a magnitude
integration unit which is configured to integrate an absolute
value of said differentiated zero-crossing rate signal to
generate a series of resultant values, wherein said
discriminator determines initiation/termination of said
speech signal within said input signal based on the
series of resultant values;

wherein said discriminator generates an output signal
indicating initiation/termination of said speech signal
within said input signal, wherein said output signal is
used to control storage of said speech signal.

2. The system of claim 1, wherein said differentiation unit
includes a smoothing filter, wherein said smoothing filter
smoothes said zero-crossing rate signal and thereby
produces a filtered zero-crossing rate signal, wherein said
differentiation unit performs said differentiation operation
with respect to time on said filtered zero-crossing rate signal
to produce the differentiated zero-crossing rate signal.

3. The system of claim 2, wherein said smoothing filter
comprises a median filter.

4. The system of claim 2, wherein said differentiation unit
calculates a first difference on said filtered zero-crossing rate
signal to produce said differentiated zero-crossing rate
signal.

5. The system of claim 1, wherein the input signal
comprises a sequence of input samples, wherein said
zero-crossing rate calculator includes a false-crossing pre-filter,
wherein said false-crossing pre-filter modifies the input
signal by assigning a zero value to an input sample if the
absolute value of the input sample is below a pre-determined
threshold, wherein said false-crossing pre-filter produces a
modified input signal, wherein said zero-crossing rate signal
is computed based on said modified input signal.

6. The system of claim 1, wherein the input signal
comprises a sequence of input samples, wherein said
zero-crossing rate calculator generates a sequence of sample
counts, wherein each sample count of said sequence of
sample counts represents the number of said input samples
required for the occurrence of L successive zero-crossings in
said input signal, wherein L is a pre-defined positive integer,
wherein said sequence of sample counts comprises said
zero-crossing rate signal.

7. The system of claim 1, wherein the input signal
comprises a sequence of input samples, wherein said
zero-crossing rate calculator generates a sequence of
zero-crossing counts, wherein each zero-crossing count of said
sequence of zero-crossing counts represents the number of
zero-crossings occurring in M successive samples of said
input signal, wherein M is a pre-defined positive integer,
wherein said sequence of zero-crossing counts comprises
said zero-crossing rate signal.

8. The system of claim 1, wherein said magnitude
integration unit is configured to calculate each resultant value of
said series of resultant values by integrating absolute values
of P consecutive samples of said differentiated zero-crossing
rate signal, wherein P is a system specified integer constant,
wherein said series of resultant values comprises a detection
signal;

wherein said discriminator further comprises a threshold
detector coupled to said magnitude integration unit,
wherein said threshold detector compares said resultant
values comprising said detection signal with a
threshold value, and generates a sequence of first decision
values, wherein a first decision value indicates the
presence of said speech signal if a respective resultant
value exceeds said threshold, and wherein the first
decision value indicates the absence of said speech
signal if the respective resultant value does not exceed
said threshold, wherein said sequence of first decision
values comprises a first decision signal.

9. The system of claim 8, wherein said discriminator 
operates on said first decision signal to produce a second 
decision signal, wherein said second decision signal com- 
prises a sequence of second decision values, wherein a 
second decision value is determined using K successive 
values of said first decision signal, wherein K is a pre- 
defined integer constant, wherein said discriminator deter- 
mines a number of said K successive values which indicate 
presence of said speech signal, and uses said number to 
determine said second decision value, wherein said second 
decision value indicates either presence or absence of said 
speech signal, wherein said second decision signal com- 
prises said output signal of said discriminator. 
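
One possible reading of the second decision stage in claim 9 is a vote over a sliding window of K first decision values. The sketch below assumes that reading; the claim only says the number of "speech present" values is used to determine the second decision value, so the `min_votes` rule and the function name `second_decisions` are hypothetical.

```python
from typing import List


def second_decisions(first: List[int], k: int, min_votes: int) -> List[int]:
    """Filter first decision values with a sliding window of k values.

    Assumption: a window is declared "speech present" (1) when at least
    min_votes of its k first decision values are 1.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    out = []
    for i in range(len(first) - k + 1):
        votes = sum(first[i:i + k])  # number of values indicating speech
        out.append(1 if votes >= min_votes else 0)
    return out


# Hypothetical first decision signal with one isolated dropout.
filtered = second_decisions([1, 0, 1, 1, 0, 0, 0], k=3, min_votes=2)
```

The isolated 0 in the second position does not produce a 0 in the filtered sequence, which is the kind of smoothing the second decision signal is meant to provide.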

10. The system of claim 1, wherein said system is 
comprised in a speech storage device, wherein said speech 
storage device receives and stores said input signal; 

wherein said speech storage device receives from said
discriminator said output signal indicating initiation/ 
termination of said speech signal within said input 
signal, and uses said output signal to control storage of 
said input signal, wherein said speech storage device 
disables storage of said input signal when said output 
signal indicates termination of said speech signal, and 
enables storage of said input signal when said output 
signal indicates initiation of said speech signal. 
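
The storage control described in claim 10 might be sketched as a small gate driven by the discriminator output. The frame representation (a pair of audio bytes and a decision value) and the name `gate_storage` are assumptions made for the example, not details taken from the claims.

```python
from typing import Iterable, List, Tuple


def gate_storage(frames: Iterable[Tuple[bytes, int]]) -> List[bytes]:
    """Keep only the audio frames that arrive while storage is enabled.

    Assumption: decision 1 means speech is present and 0 means it is absent.
    """
    stored: List[bytes] = []
    recording = False
    for audio, decision in frames:
        if decision == 1 and not recording:
            recording = True   # initiation of speech: enable storage
        elif decision == 0 and recording:
            recording = False  # termination of speech: disable storage
        if recording:
            stored.append(audio)
    return stored


# Hypothetical frames paired with discriminator decisions.
kept = gate_storage([(b"a", 0), (b"b", 1), (b"c", 1), (b"d", 0), (b"e", 1)])
```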

11. A method for detecting initiation/termination of a 
speech signal for a speech storage device, the method 
comprising: 

receiving an input signal, wherein at least a portion of said
input signal includes a speech signal;

calculating a zero-crossing rate signal based on said input
signal;

performing a differentiation operation with respect to time 
to generate a differentiated zero-crossing rate signal; 

integrating an absolute value of the differentiated zero-
crossing rate signal in order to compute a series of
resultant values; 

determining initiation/termination of said speech signal 
based on said series of resultant values, wherein said 
determining initiation/termination of said speech signal 
includes generating a control signal which indicates 
initiation/termination of said speech signal; 

wherein said control signal is used to control storage of 
said speech signal. 

12. The method of claim 11, wherein said performing a 
differentiation operation comprises: 

smoothing said zero-crossing rate signal and thereby 
producing a filtered zero-crossing rate signal; 

differentiating said filtered zero-crossing rate signal with 
respect to time in order to generate the differentiated 
zero-crossing rate signal. 

13. The method of claim 12, wherein said smoothing said 
zero-crossing rate signal comprises applying a median filter 
algorithm to said zero-crossing rate signal. 

14. The method of claim 12, wherein said differentiating
said filtered zero-crossing rate signal with respect to time 
comprises performing a first difference on said filtered 
zero-crossing rate signal. 
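
The smoothing and differentiation steps of claims 12 through 14 can be illustrated with a median filter followed by a first difference. This is a sketch only: the window width of 3, the shrinking window at the edges, and the names `median_smooth` and `first_difference` are assumptions not found in the claims.

```python
from statistics import median
from typing import List


def median_smooth(zcr: List[float], width: int = 3) -> List[float]:
    """Apply a median filter of odd width to the zero-crossing rate signal.

    Assumption: near the edges the window simply shrinks to the samples
    that are available.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    out = []
    for i in range(len(zcr)):
        lo = max(0, i - half)
        hi = min(len(zcr), i + half + 1)
        out.append(median(zcr[lo:hi]))
    return out


def first_difference(x: List[float]) -> List[float]:
    """Return x[n] - x[n-1] for each pair of adjacent samples."""
    return [b - a for a, b in zip(x, x[1:])]


# A steady rate with one outlier: the median filter removes the spike,
# so the first difference of the smoothed signal is zero throughout.
smoothed = median_smooth([5, 5, 50, 5, 5])
differentiated = first_difference(smoothed)
```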

15. The method of claim 11, wherein said input signal 
comprises a sequence of input samples, wherein said calcu- 
lating a zero-crossing rate signal based on said input signal 
includes: 

modifying said input signal by assigning a zero value to 
an input sample if the absolute value of the input 
sample is below a pre-determined threshold, wherein 
said modifying produces a modified input signal; 

wherein said zero-crossing rate signal is based on said 
modified input signal. 
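
The input modification of claim 15 amounts to zeroing low-amplitude samples before zero-crossings are counted. A minimal sketch, with the hypothetical name `clip_small_samples` and an arbitrary threshold chosen for the example:

```python
from typing import List


def clip_small_samples(samples: List[int], threshold: int) -> List[int]:
    """Set a sample to zero when its absolute value is below the threshold."""
    return [0 if abs(s) < threshold else s for s in samples]


# Hypothetical input samples; values with magnitude below 3 become zero.
modified = clip_small_samples([3, -1, 0, -7, 2], threshold=3)
```

Zeroing these samples keeps low-level noise from registering as spurious sign changes in the later zero-crossing count.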

16. The method of claim 11, wherein the input signal
comprises a sequence of input samples, wherein said calcu-
lating a zero-crossing rate signal comprises generating a
sequence of sample counts, wherein each sample count of
said sequence of sample counts represents the number of
said input samples required for the occurrence of L succes-
sive zero-crossings in said input signal, wherein L is a
pre-defined positive integer, wherein said sequence of 
sample counts comprises said zero-crossing rate signal. 
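
The sample-count form of the zero-crossing rate signal in claim 16 can be sketched as below. The definition of a zero-crossing used here (a sign change between successive non-zero samples) and the name `samples_per_l_crossings` are assumptions for illustration; the claim does not spell out how zero-valued samples are treated.

```python
from typing import List


def samples_per_l_crossings(samples: List[int], l: int) -> List[int]:
    """Count how many input samples elapse for each run of l zero-crossings.

    Assumption: a zero-crossing is a change of sign between the current
    non-zero sample and the most recent non-zero sample.
    """
    if l <= 0:
        raise ValueError("l must be a positive integer")
    counts = []
    last_sign = 0
    crossings = 0
    elapsed = 0
    for s in samples:
        elapsed += 1
        sign = (s > 0) - (s < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                crossings += 1
            last_sign = sign
        if crossings == l:
            counts.append(elapsed)  # one sample of the rate signal
            crossings = 0
            elapsed = 0
    return counts


# Hypothetical input: fast alternation first, then a slower stretch.
rate_signal = samples_per_l_crossings([1, -1, 1, -1, 1, 1, 1, -1, -1, 1], l=2)
```

In this form a smaller count corresponds to a higher zero-crossing rate, because fewer samples are needed to reach L crossings.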

17. The method of claim 11, wherein the input signal 
comprises a sequence of input samples, wherein said calcu- 
lating a zero-crossing rate signal comprises generating a 
sequence of zero-crossing counts, wherein each zero-
crossing count of said sequence of zero-crossing counts
represents the number of zero-crossings occurring in M
successive input samples of said input signal, wherein said
sequence of zero-crossing counts comprises said zero-
crossing rate signal.
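
The alternative of claim 17, counting zero-crossings within each run of M input samples, might look like the following. As before this is a sketch under assumptions: frames are taken as non-overlapping, a zero-crossing is a sign change between successive non-zero samples, and `crossings_per_frame` is a hypothetical name.

```python
from typing import List


def crossings_per_frame(samples: List[int], m: int) -> List[int]:
    """Count zero-crossings in each non-overlapping frame of m samples.

    Assumption: the sign of the last non-zero sample carries over between
    frames, so a crossing at a frame boundary counts in the later frame.
    """
    if m <= 0:
        raise ValueError("m must be a positive integer")
    counts = []
    last_sign = 0
    for start in range(0, len(samples) - m + 1, m):
        crossings = 0
        for s in samples[start:start + m]:
            sign = (s > 0) - (s < 0)
            if sign != 0:
                if last_sign != 0 and sign != last_sign:
                    crossings += 1
                last_sign = sign
        counts.append(crossings)
    return counts


# Hypothetical input split into three frames of four samples each.
frame_counts = crossings_per_frame([1, -1, 1, -1, 1, 1, 1, 1, -1, 1, -1, 1], m=4)
```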

18. The method of claim 11, wherein said integrating the 
absolute value of the zero-crossing rate signal comprises 
computing each of the resultant values by integrating P 
consecutive samples of said differentiated zero-crossing rate 
signal, wherein P is a system specified integer constant, 
wherein said series of resultant values comprises a detection 
signal; 

wherein said determining initiation/termination of said 
speech signal based on said series of resultant values
comprises comparing said resultant values comprising 
said detection signal with a threshold value, and gen- 
erating a sequence of first decision values, wherein a 
first decision value indicates the presence of said 
speech signal if a respective resultant value exceeds 
said threshold, and wherein the first decision value 
indicates the absence of said speech signal if the 
respective value does not exceed said threshold, 
wherein said sequence of first decision values com- 
prises a first decision signal. 

19. The method of claim 18, wherein said determining 
initiation/termination of said speech signal based on said 
differentiated zero-crossing rate signal further comprises: 

producing a sequence of second decision values using 
said first decision signal, wherein each second decision 
value is produced using a corresponding window of K 
successive first decision values from said first decision 
signal, wherein K is a pre-defined integer constant, 
wherein producing a second decision value comprises: 
determining a number of said K successive values 
which indicate presence of said speech signal; and 
using said number to determine said second decision 
value, wherein said second decision value indicates 
either presence or absence of said speech signal; 
wherein said second decision signal comprises said con- 
trol signal. 

20. The method of claim 11, wherein said method operates
in a speech storage device, the method further comprising:

storing said input signal in response to said control signal
indicating initiation of said speech signal;

discontinuing said storing said input signal in response to
said control signal indicating termination of said speech
signal.

21. A system for detecting termination of a speech mes- 
sage for a speech storage device, the system comprising: 

an input for receiving an input signal, wherein at least a 
portion of said input signal includes a speech message 
signal; 

a zero-crossing rate calculator coupled to said input for 
computing a zero-crossing rate signal based upon said
input signal; 

a differentiation unit coupled to said zero-crossing rate 
calculator which receives said zero-crossing rate signal 
from said zero-crossing rate calculator, wherein the 
differentiation unit is configured to perform a differen-
tiation operation with respect to time to produce a
differentiated zero-crossing rate signal;

a discriminator coupled to said differentiation unit which 
receives said differentiated zero-crossing rate signal, 
wherein said discriminator comprises a magnitude inte-
gration unit which is configured to integrate an absolute
value of said differentiated zero-crossing rate signal to 
generate a series of resultant values, wherein said 
discriminator determines termination of said speech 
message signal within said input signal based on the
series of resultant values;

wherein said discriminator generates an output signal 
indicating termination of said speech message signal, 
wherein said output signal is used to control storage of 
said speech message signal.

22. A telephone answering device comprising: 

an input for receiving an input signal, wherein at least a
portion of said input signal includes a speech message 
signal; 

a memory media which receives and stores said input 
signal; 

a message-termination detector coupled to said input, and 
operable to determine termination of said speech mes- 
sage signal within said input signal, wherein said 
message-termination detector generates a control signal 
indicating termination of said speech message signal; 
wherein said telephone answering device discontinues 
storage of said input signal in said memory media in 
response to said control signal indicating termination of 
said speech message signal; 
wherein said message-termination detector comprises: 
a zero-crossing rate calculator coupled to said input for 
computing a zero-crossing rate signal based upon 
said input signal; 
a differentiation unit coupled to said zero-crossing rate 
calculator which receives said zero-crossing rate 
signal from said zero-crossing rate calculator,
wherein the differentiation unit is configured to per- 
form a differentiation operation with respect to time 
to produce a differentiated zero-crossing rate signal;
a discriminator coupled to said differentiation unit 
which receives said differentiated zero-crossing rate 
signal, wherein said discriminator comprises a mag- 
nitude integration unit which is configured to inte- 
grate an absolute value of said differentiated zero- 
crossing rate signal to generate a series of resultant 
values, wherein said discriminator determines termi- 
nation of said speech message signal within said 
input signal based on the series of resultant values. 

* * * * *